Mandelbrot (see [1] for example) has suggested that many real-world random
processes (first differences of stock market prices over time, the size
distribution of income, etc.) are particular instances of processes
characterized by "stable non-Gaussian" probability laws called "Pareto-Levy"
laws (cf. M. Loeve [2]).

One particular member of the class of probability laws stable in the
Pareto-Levy sense is the (strong) Pareto process. It is defined as a
stochastic process generating independent random variables
$\tilde{x}_1, \ldots, \tilde{x}_i, \ldots$ with identical Pareto density
functions

$$
f_a(x_i \mid \alpha, x_0) =
\begin{cases}
0 & \text{if } x_i < x_0, \\
\alpha x_0^{\alpha} x_i^{-(\alpha+1)} & \text{if } x_i \ge x_0,
\end{cases}
\qquad i = 1, 2, \ldots \tag{1}
$$

where $\alpha > 0$ and $x_0 > 0$ are parameters of the above density function.
The density function (1) possesses properties which make life difficult for
the Bayesian: when $1 < \alpha \le 2$ the variance of $f_a$ is infinite, and
when $0 < \alpha \le 1$ both the mean and variance of $f_a$ are infinite.

Our interest here is in the following question: How do we derive
distribution-theoretic results needed for a "Bayesian" analysis of decision
problems where the consequences of adopting one of a set of available
(terminal) acts depend on the true underlying value of $\alpha$, and where
information about $\alpha$ can be obtained by observation of values of
$\tilde{x}_1, \ldots, \tilde{x}_i, \ldots$ when the $\tilde{x}_i$'s are
independent random variables, identically distributed according to (1)?

We  follow  the  pattern  of  [3]  (ASDT)  and  present  the  following: 

1.  Definition  of  and  properties  of  Pareto  density  function. 

2.  Definition  and  characterization  of  the  Pareto  process 
and  its  properties. 

3.  Likelihood  of  a  sample  and  the  sufficient  statistic. 

4.  Conjugate prior distribution of $\alpha$ and binary operation
for going from the prior parameter and sample statistic
to the posterior parameter.

5.  Conditional distribution of the sufficient statistic for
a given value of $\alpha$ and a fixed size sample from the
process.

6.  Unconditional sampling distribution of the sufficient
statistic.

7.  Some  facts  about  the  (prior)  distribution  of  the  mean 
of  the  process  when  a  is  not  known  with  certainty. 

A subsequent note will discuss further aspects of analysis of decision
problems under uncertainty, when the underlying data generating process is of
the strong Pareto type.


1.   The Pareto Function

The Pareto density function is defined by (1). The $k$th moments of $f_a$
about the origin are

$$
E(\tilde{x}^k) =
\begin{cases}
+\infty & \text{if } k \ge \alpha, \\
\dfrac{\alpha x_0^{k}}{\alpha - k} & \text{if } k < \alpha.
\end{cases}
\tag{1.2}
$$

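Proof: To prove that $f_a$ is a density function, note that it is positive
for all $x \in (x_0, \infty)$ and that

$$
\int_{x_0}^{\infty} f_a(z \mid \alpha, x_0)\,dz
= \alpha x_0^{\alpha} \int_{x_0}^{\infty} z^{-(\alpha+1)}\,dz
= \alpha x_0^{\alpha} \left[ -\lim_{z \to \infty} \frac{1}{\alpha z^{\alpha}} + \frac{1}{\alpha x_0^{\alpha}} \right]
= (\alpha x_0^{\alpha})(\alpha x_0^{\alpha})^{-1} = 1 .
$$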


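To prove (1.2), we have, for $k \ne \alpha$,

$$
E(\tilde{x}^k) = \int_{x_0}^{\infty} z^{k} f_a(z \mid \alpha, x_0)\,dz
= \alpha x_0^{\alpha} \int_{x_0}^{\infty} z^{k-\alpha-1}\,dz
= \alpha x_0^{\alpha} \left[ \lim_{z \to \infty} \frac{z^{k-\alpha}}{k-\alpha} - \frac{x_0^{k-\alpha}}{k-\alpha} \right] .
$$

Thus if $k > \alpha$, $E(\tilde{x}^k) = +\infty$, and if $k < \alpha$,

$$
E(\tilde{x}^k) = \frac{\alpha x_0^{k}}{\alpha - k} .
$$

For $k = \alpha$ the antiderivative of $z^{-1}$ is $\log_e z$, which also
diverges, so $E(\tilde{x}^k) = +\infty$ in that case as well.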
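As an illustrative check of (1.2), the following minimal Python sketch compares the closed form with a numerical integral. The function names and the midpoint rule are assumptions made for illustration and are not part of the original note; the substitution $z = x_0/u$ is used only to map the infinite range onto $(0, 1]$.

```python
import math

def pareto_moment(k, alpha, x0):
    """k-th moment about the origin of the Pareto density (1), per (1.2)."""
    if k >= alpha:
        return math.inf
    return alpha * x0**k / (alpha - k)

def pareto_moment_numeric(k, alpha, x0, n=100_000):
    """Midpoint-rule check for k < alpha (hypothetical helper).

    Substituting z = x0/u turns the integral of z**k * f(z) over (x0, inf)
    into alpha * x0**k * integral of u**(alpha-k-1) over (0, 1].
    """
    h = 1.0 / n
    total = sum(((j + 0.5) * h) ** (alpha - k - 1.0) for j in range(n))
    return alpha * x0**k * total * h

if __name__ == "__main__":
    print(pareto_moment(2, 3.5, 2.0), pareto_moment_numeric(2, 3.5, 2.0))
    print(pareto_moment(2, 1.5, 1.0))  # infinite, since k >= alpha
```

The numerical routine is only meaningful for $k < \alpha$; for $k \ge \alpha$ the integral diverges and the closed-form function returns infinity directly.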


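The incomplete $k$th moment about the origin follows directly and is, for
$z_0 \ge x_0$,

$$
E_{z_0}^{\infty}(\tilde{x}^k) \equiv \int_{z_0}^{\infty} z^{k} f_a(z \mid \alpha, x_0)\,dz =
\begin{cases}
+\infty & \text{if } k \ge \alpha, \\
\dfrac{\alpha x_0^{\alpha} z_0^{k-\alpha}}{\alpha - k} & \text{if } k < \alpha.
\end{cases}
\tag{1.3}
$$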


The cumulative distribution function is

$$
F_a(x \mid \alpha, x_0) = 1 - \left( \frac{x}{x_0} \right)^{-\alpha}, \qquad x \ge x_0 . \tag{1.4}
$$
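Because (1.4) inverts in closed form, Pareto variates can be generated by the inverse-CDF method. The sketch below is a minimal illustration under that assumption; the function names and the use of Python's `random` module are choices made here, not part of the original note.

```python
import random

def pareto_cdf(x, alpha, x0):
    """Cumulative distribution function (1.4); zero below x0."""
    if x < x0:
        return 0.0
    return 1.0 - (x / x0) ** (-alpha)

def pareto_quantile(u, alpha, x0):
    """Inverse of (1.4): the x solving F(x) = u, for 0 <= u < 1."""
    return x0 * (1.0 - u) ** (-1.0 / alpha)

def pareto_sample(n, alpha, x0, seed=0):
    """Draw n independent values by feeding uniforms through the quantile."""
    rng = random.Random(seed)
    return [pareto_quantile(rng.random(), alpha, x0) for _ in range(n)]

if __name__ == "__main__":
    xs = pareto_sample(5, alpha=2.5, x0=4.0)
    print(xs)
    print([round(pareto_cdf(x, 2.5, 4.0), 3) for x in xs])
```

When $\alpha \le 2$ the sample variance of such draws does not settle down as $n$ grows, which is the practical face of the infinite-variance property noted in the introduction.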


2.   The  Pareto  Process 


The Pareto process is defined as a stochastic process generating
independent random variables $\tilde{x}_1, \ldots, \tilde{x}_i, \ldots$ with
identical Pareto density functions

$$
f_a(x_i \mid \alpha, x_0) = \alpha x_0^{\alpha} x_i^{-(\alpha+1)}, \qquad
\alpha > 0, \quad x_0 > 0, \quad x_i \ge x_0 . \tag{2.1}
$$


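From (1.2) we may obtain the mean and variance of the process:

$$
E(\tilde{x} \mid \alpha, x_0) =
\begin{cases}
+\infty & \text{if } \alpha \le 1, \\
\dfrac{\alpha x_0}{\alpha - 1} & \text{if } \alpha > 1,
\end{cases}
\tag{2.2}
$$

and

$$
V(\tilde{x} \mid \alpha, x_0) =
\begin{cases}
+\infty & \text{if } \alpha \le 2, \\
\dfrac{\alpha x_0^{2}}{(\alpha-2)(\alpha-1)^{2}} & \text{if } \alpha > 2.
\end{cases}
\tag{2.3}
$$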


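Proof: Formula (2.2) follows from setting $k = 1$ in (1.2). Formula (2.3) is
determined by use of the fact that

$$
V(\tilde{x} \mid \alpha, x_0) = E(\tilde{x}^2 \mid \alpha, x_0) - E^2(\tilde{x} \mid \alpha, x_0),
$$

and that

$$
V(\tilde{x} \mid \alpha, x_0) \equiv \int_{x_0}^{\infty} [z - E(\tilde{x} \mid \alpha, x_0)]^2 f_a(z \mid \alpha, x_0)\,dz .
$$

By (1.2), if $0 < \alpha \le 1$ then $E(\tilde{x} \mid \alpha, x_0) = +\infty$.
Thus for all values of $z$ the integrand $[z - E(\tilde{x} \mid \alpha, x_0)]^2$
is unbounded and hence $V(\tilde{x} \mid \alpha, x_0) = +\infty$.
If $1 < \alpha \le 2$ then $V(\tilde{x} \mid \alpha, x_0) = +\infty$, for

$$
V(\tilde{x} \mid \alpha, x_0) = E(\tilde{x}^2 \mid \alpha, x_0) - E^2(\tilde{x} \mid \alpha, x_0),
$$

and

$$
E(\tilde{x}^2 \mid \alpha, x_0) = +\infty \quad \text{and} \quad E^2(\tilde{x} \mid \alpha, x_0) < \infty
$$

by (1.2). When $\alpha > 2$, by (1.2) again

$$
V(\tilde{x} \mid \alpha, x_0)
= \frac{\alpha x_0^{2}}{\alpha - 2} - \frac{\alpha^{2} x_0^{2}}{(\alpha - 1)^{2}}
= \frac{\alpha x_0^{2}}{(\alpha-2)(\alpha-1)^{2}} .
$$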
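The three regimes in (2.2) and (2.3) can be made concrete with a short Python sketch. The helper names are hypothetical; the only inputs are the closed forms above and the moment formula (1.2), and the check confirms that the variance equals the second moment minus the squared mean when $\alpha > 2$.

```python
import math

def pareto_mean(alpha, x0):
    """Mean of the process, per (2.2)."""
    return math.inf if alpha <= 1 else alpha * x0 / (alpha - 1)

def pareto_variance(alpha, x0):
    """Variance of the process, per (2.3)."""
    if alpha <= 2:
        return math.inf
    return alpha * x0**2 / ((alpha - 2) * (alpha - 1) ** 2)

def second_moment(alpha, x0):
    """E(x^2) from (1.2) with k = 2."""
    return math.inf if alpha <= 2 else alpha * x0**2 / (alpha - 2)

if __name__ == "__main__":
    for alpha in (0.8, 1.5, 3.0):
        print(alpha, pareto_mean(alpha, 2.0), pareto_variance(alpha, 2.0))
```

For $\alpha = 3$ and $x_0 = 2$ the mean is $3$, the second moment is $12$, and the variance is $12 - 9 = 3$, in agreement with (2.3).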


A "unique and important" property of a Poisson process is that
if the independent random variables
$\tilde{u}_1, \ldots, \tilde{u}_i, \ldots$
are generated by a Poisson process, then

$$
P(\tilde{u}_i > \xi + u \mid \tilde{u}_i > \xi) = P(\tilde{u}_i > u), \qquad \xi \ge 0, \; u \ge 0 . \tag{2.4}
$$

This is interpretable as "independence of past history," and is a
unique property of the Poisson process among all processes generating
continuous random variables $\tilde{u} \ge 0$. (See [1] for discussion of this
property.) By analogy, it is easy to show that for independent
random variables $\tilde{x}_1, \ldots, \tilde{x}_i, \ldots$ generated according
to (1) and $\xi \ge x_0$, $x \ge x_0$,

$$
P(\tilde{x}_i > \xi x / x_0 \mid \tilde{x}_i > \xi) = P(\tilde{x}_i > x) = (x / x_0)^{-\alpha}, \qquad i = 1, 2, \ldots \tag{2.5}
$$

(the factor $1/x_0$ drops out when observations are measured in units of
$x_0$), and that the (strong) Pareto process is the only process among all
processes generating continuous random variables $\tilde{x}_i \ge x_0 > 0$ to
possess this property.

Proof: Make the change of variable $u_i = \log_e (x_i / x_0)$ in (1) and
observe that (2.5) implies (2.4) and conversely, since this transformation
is one to one from $x_i$ to $u_i$ and from $u_i$ to $x_i$, because
$x_0 e^{u_i} = x_i$.
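The scaling property can be checked numerically from the survival function alone. The sketch below assumes the property is read as $P(\tilde{x}_i > \xi x / x_0 \mid \tilde{x}_i > \xi) = P(\tilde{x}_i > x)$, which is the form obtained by mapping the exponential "no memory" property through $u = \log_e(x/x_0)$; that reading, and the function names, are assumptions made here for illustration.

```python
def pareto_survival(x, alpha, x0):
    """P(x_i > x): equals (x/x0)**(-alpha) for x >= x0 and 1 below x0."""
    return 1.0 if x < x0 else (x / x0) ** (-alpha)

def conditional_survival(xi, x, alpha, x0):
    """P(x_i > xi*x/x0 | x_i > xi), for thresholds xi >= x0 and x >= x0."""
    joint = pareto_survival(xi * x / x0, alpha, x0)
    return joint / pareto_survival(xi, alpha, x0)

if __name__ == "__main__":
    alpha, x0, x = 1.7, 2.0, 5.0
    for xi in (2.0, 3.5, 40.0):
        # The conditional probability does not depend on the threshold xi.
        print(xi, conditional_survival(xi, x, alpha, x0),
              pareto_survival(x, alpha, x0))
```

The printed conditional probabilities agree with the unconditional one for every threshold, which is the multiplicative analogue of independence of past history.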

3.   Independent Pareto Process When $x_0$ is Known


3.1  Likelihood of a Sample When $x_0$ is Known


The likelihood that an independent Pareto process will generate
$r$ successive values $x_1, x_2, \ldots, x_r$ is the product of their
individual likelihoods (1):

$$
(\alpha x_0^{\alpha})^{r} \prod_{i=1}^{r} x_i^{-(\alpha+1)} . \tag{3.1}
$$


If we define the statistic

$$
t = \prod_{i=1}^{r} x_i , \tag{3.2}
$$

we may write (3.1) as

$$
(\alpha x_0^{\alpha})^{r} \, t^{-(\alpha+1)} . \tag{3.1'}
$$

Alternatively, (3.1) may be written as

$$
\xi \, \alpha^{r} e^{-\lambda \alpha} \tag{3.3}
$$

where

$$
\xi = 1/t \quad \text{and} \quad \lambda \equiv \log_e (t / x_0^{r}) . \tag{3.4}
$$

Since $\xi$ in no way depends on the unknown parameter $\alpha$, it is clear that

$$
\alpha^{r} e^{-\lambda \alpha} \tag{3.5}
$$

is the kernel of the likelihood. If the sampling process is non-informative
then clearly $(r, t)$ is a sufficient statistic when $x_0$ is known.
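To illustrate the reduction to $(r, t)$, the following Python sketch evaluates the log of the likelihood once directly from the observations, as in (3.1), and once from the statistic alone, as in (3.3) and (3.4). The function names are hypothetical and the sample values are made up for the demonstration.

```python
import math

def sufficient_statistic(xs):
    """Return (r, t) with t the product of the observations, as in (3.2)."""
    return len(xs), math.prod(xs)

def lam(r, t, x0):
    """lambda = log(t / x0**r), as in (3.4)."""
    return math.log(t) - r * math.log(x0)

def log_likelihood_direct(xs, alpha, x0):
    """Log of (3.1): sum of the log Pareto densities of the observations."""
    return sum(math.log(alpha) + alpha * math.log(x0)
               - (alpha + 1.0) * math.log(x) for x in xs)

def log_likelihood_from_statistic(r, t, alpha, x0):
    """Log of (3.3): log(xi) + r*log(alpha) - lambda*alpha with xi = 1/t."""
    return -math.log(t) + r * math.log(alpha) - lam(r, t, x0) * alpha

if __name__ == "__main__":
    xs = [2.5, 3.0, 4.2]          # illustrative data, all >= x0
    r, t = sufficient_statistic(xs)
    print(log_likelihood_direct(xs, 1.7, 2.0))
    print(log_likelihood_from_statistic(r, t, 1.7, 2.0))
```

For long samples the product $t$ can overflow floating point, so a practical implementation would probably carry $\log_e t$ (equivalently $\lambda$) rather than $t$ itself.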

3.2.   Conjugate Distribution of $\tilde{\alpha}$

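When $x_0$ is known but $\tilde{\alpha}$ is regarded as a random variable, the
most convenient distribution of $\tilde{\alpha}$ is a gamma-1 distribution
which may be written as

$$
f_{\gamma 1}(\alpha \mid r, t) = \frac{e^{-\lambda \alpha} (\lambda \alpha)^{r-1} \lambda}{\Gamma(r)}, \qquad
\alpha \ge 0, \quad r > 0, \quad \lambda > 0, \tag{3.6}
$$

where $\lambda = \log_e (t / x_0^{r})$. If such a distribution with parameter
$(r', t')$ has been assigned to $\tilde{\alpha}$ and if then a sample from the
process yields a sufficient statistic $(r, t)$, the posterior distribution of
$\tilde{\alpha}$ will be gamma-1 with parameter $(r'', t'')$ where

$$
r'' = r' + r , \qquad t'' = t' t , \tag{3.7a}
$$

or alternatively $(r'', \lambda'')$ where

$$
\lambda'' = \lambda' + \lambda . \tag{3.7b}
$$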

Proof: Formula (3.6) follows directly from the discussion of sufficiency.
Formulas (3.7) follow from Bayes' Theorem; i.e., the posterior density of
$\tilde{\alpha}$ is proportional to the product of the kernel of the prior
density (3.6) of $\tilde{\alpha}$ and the kernel (3.5) of the sample
likelihood:

$$
D''(\alpha \mid r', t'; r, t) \propto \alpha^{r} e^{-\lambda \alpha} \cdot \alpha^{r'-1} e^{-\lambda' \alpha}
= \alpha^{r + r' - 1} e^{-\alpha(\lambda + \lambda')}
$$

where

$$
\lambda = \log_e (t / x_0^{r}) \quad \text{and} \quad \lambda' = \log_e (t' / x_0^{r'}) ,
$$

so that if we define $r'' = r' + r$ and

$$
\lambda'' = \lambda + \lambda' = \log_e \left[ \left( \frac{t}{x_0^{r}} \right) \left( \frac{t'}{x_0^{r'}} \right) \right]
= \log_e (t'' / x_0^{r''}) ,
$$

the posterior density has a kernel of the same gamma-1 form as (3.6).
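The prior-to-posterior operation amounts to adding the counts and multiplying the products (equivalently, adding the $\lambda$'s). A minimal Python sketch follows; the function names are hypothetical, and the posterior mean uses the standard fact that a gamma density with shape $r$ and rate $\lambda$ has mean $r/\lambda$.

```python
import math

def lam(r, t, x0):
    """lambda = log(t / x0**r), as in (3.4)."""
    return math.log(t) - r * math.log(x0)

def posterior_parameters(r_prior, t_prior, r, t):
    """Binary operation (3.7a): r'' = r' + r and t'' = t' * t."""
    return r_prior + r, t_prior * t

def posterior_mean_alpha(r_post, t_post, x0):
    """Mean of a gamma-1 density with shape r'' and rate lambda''."""
    return r_post / lam(r_post, t_post, x0)

if __name__ == "__main__":
    x0 = 2.0
    r_prior, t_prior = 2, 20.0      # illustrative prior parameter
    r, t = 3, 31.5                  # illustrative sample statistic
    r_post, t_post = posterior_parameters(r_prior, t_prior, r, t)
    print(r_post, t_post, posterior_mean_alpha(r_post, t_post, x0))
```

The additive form (3.7b) can be confirmed by noting that `lam(r_post, t_post, x0)` equals `lam(r_prior, t_prior, x0) + lam(r, t, x0)`.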


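The mean, variance, and partial moments of the distribution of $\tilde{\alpha}$
are, from ASDT, with $\lambda = \log_e (t / x_0^{r})$ as in (3.6):

$$
E(\tilde{\alpha} \mid r, t) \equiv \bar{\alpha} = \frac{r}{\lambda} , \tag{3.7}
$$

$$
E_0^{\alpha}(\tilde{\alpha} \mid r, t) = \bar{\alpha} \, F_{\gamma *}(\alpha \lambda \mid r+1) , \tag{3.8}
$$

$$
V(\tilde{\alpha} \mid r, t) = \frac{r}{\lambda^{2}} . \tag{3.9}
$$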


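In ASDT it is shown that the linear (in $\alpha$) loss integrals are

$$
L_l(\alpha) \equiv \int_0^{\alpha} (\alpha - z) f_{\gamma 1}(z \mid r, t)\,dz
= \alpha F_{\gamma *}(\alpha \lambda \mid r) - \bar{\alpha} F_{\gamma *}(\alpha \lambda \mid r+1) \tag{3.10a}
$$

and

$$
L_r(\alpha) \equiv \int_{\alpha}^{\infty} (z - \alpha) f_{\gamma 1}(z \mid r, t)\,dz
= \bar{\alpha} G_{\gamma *}(\alpha \lambda \mid r+1) - \alpha G_{\gamma *}(\alpha \lambda \mid r) \tag{3.10b}
$$

where $\lambda = \log_e (t / x_0^{r})$ and $\bar{\alpha} = r / \lambda$,
although these are not loss integrals in which we will be interested
subsequently.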

3.3  Distribution  of  the  Mean  a 

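If $\alpha > 1$ then the mean

$$
a = \frac{\alpha x_0}{\alpha - 1} , \qquad \alpha > 1 ,
$$

of

$$
\tilde{x} \mid \alpha \sim f_a(x \mid \alpha, x_0)
$$

exists by (1.2). If $\alpha$ is not known with certainty then, provided that
we assume that

$$
\tilde{\alpha} \sim f_{\gamma 1}(\alpha \mid r, t) ,
$$

it follows that for $\alpha > 1$ (conditional on $\tilde{\alpha} > 1$),

$$
\tilde{a} \sim p^{-1} f_{\gamma 1}(h(a) \mid r, t) \, \frac{x_0}{(a - x_0)^{2}} , \qquad a > x_0 , \tag{3.11}
$$

where

$$
h(a) = \frac{a}{a - x_0} \quad \text{and} \quad p = G_{\gamma 1}(1 \mid r, t) .
$$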

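Proof: Since $a = \alpha x_0 / (\alpha - 1)$ when $\alpha > 1$,

$$
\alpha = h(a) = a / (a - x_0) .
$$

The function $h(a)$ is continuous and monotonic decreasing for $a > x_0$,
so that

$$
\tilde{a} \mid \tilde{\alpha} > 1 \; \sim \; p^{-1} f_{\gamma 1}(h(a) \mid r, t) \left| \frac{dh(a)}{da} \right| , \qquad a > x_0 ,
$$

and as

$$
\frac{dh(a)}{da} = \frac{-x_0}{(a - x_0)^{2}} ,
$$

(3.11) follows directly.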
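The change of variable behind (3.11) can be checked numerically. The sketch below assumes an integer shape $r$, so that the right tail of the gamma-1 density has the finite Erlang sum used in `gamma1_sf`, and it parameterizes the density directly by $\lambda$ rather than by $t$; both are conveniences introduced here, and the function names are hypothetical. The check compares the density (3.11) with a finite-difference derivative of the conditional distribution function of $\tilde{a}$.

```python
import math

def gamma1_pdf(z, r, lam):
    """Gamma-1 density (3.6) with shape r and rate lam."""
    return math.exp(-lam * z) * (lam * z) ** (r - 1) * lam / math.gamma(r)

def gamma1_sf(z, r, lam):
    """P(alpha > z) for integer r: exp(-lam*z) * sum_{j<r} (lam*z)**j / j!."""
    c = lam * z
    return math.exp(-c) * sum(c**j / math.factorial(j) for j in range(r))

def mean_density(a, r, lam, x0):
    """Density (3.11) of a = alpha*x0/(alpha-1), conditional on alpha > 1."""
    if a <= x0:
        return 0.0
    h = a / (a - x0)
    p = gamma1_sf(1.0, r, lam)
    return gamma1_pdf(h, r, lam) * x0 / (a - x0) ** 2 / p

def mean_cdf(a, r, lam, x0):
    """P(a_tilde <= a | alpha > 1) = P(alpha >= h(a)) / P(alpha > 1)."""
    h = a / (a - x0)
    return gamma1_sf(h, r, lam) / gamma1_sf(1.0, r, lam)

if __name__ == "__main__":
    r, lam, x0, a, d = 3, 1.5, 2.0, 5.0, 1e-5
    slope = (mean_cdf(a + d, r, lam, x0) - mean_cdf(a - d, r, lam, x0)) / (2 * d)
    print(slope, mean_density(a, r, lam, x0))
```

Because $h$ is decreasing, small values of $\tilde{a}$ correspond to large values of $\tilde{\alpha}$, which is why the distribution function of $\tilde{a}$ is built from the right tail of the gamma-1 distribution.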

Of more interest than the loss integrals (3.10) are loss integrals
linear in $a$, which, as we would expect, have certain undesirable
properties if $\tilde{\alpha} \le 1$ with positive probability. However, we
can make a step in the direction of doing (prior) terminal analysis of a
two-action linear loss decision problem if we assume that
$\tilde{\alpha} > 1$ with probability 1.
In order to proceed in this direction we will need to know the
conditional expectations $E(\tilde{a} \mid \tilde{\alpha} \le \epsilon)$ and
$E(\tilde{a} \mid \tilde{\alpha} \ge \epsilon)$ for values of $\epsilon$
between 0 and $+\infty$. We show that

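$$
m \equiv E(\tilde{a} \mid \tilde{\alpha} > \epsilon) =
\begin{cases}
+\infty & \text{for } \epsilon \le 1 , \\[4pt]
\dfrac{x_0 e^{-\lambda}}{p \, \Gamma(r)} \left[ \displaystyle\sum_{i=1}^{r} \Gamma(i) \binom{r}{i} \lambda^{r-i} G_{\gamma *}(\lambda(\epsilon - 1) \mid i) + \lambda^{r} \Phi(\epsilon) \right] & \text{for } \epsilon > 1 ,
\end{cases}
$$

where $r$ is a positive integer,

$$
\Phi(\epsilon) = \int_{\lambda(\epsilon - 1)}^{\infty} y^{-1} e^{-y}\,dy , \qquad
\lambda \equiv \log_e (t / x_0^{r}) \qquad \text{and} \qquad p = G_{\gamma 1}(\epsilon \mid r, t) .
$$

Proof: The case $\epsilon \le 1$ follows directly from (1.2). Now suppose that

$$
\tilde{\alpha} \sim f_{\gamma 1}(\alpha \mid r, t) = \frac{\lambda}{\Gamma(r)} (\lambda \alpha)^{r-1} e^{-\lambda \alpha} , \qquad \alpha \ge 0 .
$$

Then for $\epsilon > 1$,

$$
E(\tilde{a} \mid \tilde{\alpha} > \epsilon)
= \frac{x_0}{p} \int_{\epsilon}^{\infty} \left( \frac{z}{z - 1} \right) f_{\gamma 1}(z \mid r, t)\,dz
= \frac{x_0}{p \lambda \Gamma(r)} \int_{\epsilon}^{\infty} (z - 1)^{-1} e^{-\lambda z} (\lambda z)^{r}\,d(\lambda z) .
$$

Letting $u = z - 1$, we may write the above as

$$
\frac{x_0 e^{-\lambda}}{p \lambda \Gamma(r)} \int_{\epsilon - 1}^{\infty} u^{-1} e^{-\lambda u} [\lambda u + \lambda]^{r}\,d(\lambda u) ,
$$

or, letting $y = \lambda u$ and expanding $(y + \lambda)^{r}$ by the binomial
theorem, as

$$
\frac{x_0 e^{-\lambda}}{p \, \Gamma(r)} \int_{\lambda(\epsilon - 1)}^{\infty} y^{-1} e^{-y} \left[ \sum_{i=0}^{r} \binom{r}{i} \lambda^{r-i} y^{i} \right] dy
= \frac{x_0 e^{-\lambda}}{p \, \Gamma(r)} \sum_{i=0}^{r} \binom{r}{i} \lambda^{r-i} \int_{\lambda(\epsilon - 1)}^{\infty} y^{i-1} e^{-y}\,dy .
$$

Since

$$
G_{\gamma *}(\lambda(\epsilon - 1) \mid i) = \frac{1}{\Gamma(i)} \int_{\lambda(\epsilon - 1)}^{\infty} y^{i-1} e^{-y}\,dy , \qquad i > 0 ,
$$

we may write

$$
E(\tilde{a} \mid \tilde{\alpha} > \epsilon) = \frac{x_0 e^{-\lambda}}{p \, \Gamma(r)} \left\{ \sum_{i=1}^{r} \Gamma(i) \binom{r}{i} \lambda^{r-i} G_{\gamma *}(\lambda(\epsilon - 1) \mid i) + \lambda^{r} \Phi(\epsilon) \right\}
$$

where

$$
\Phi(\epsilon) = \int_{\lambda(\epsilon - 1)}^{\infty} y^{-1} e^{-y}\,dy .
$$

For any $\epsilon > 1$ the integral immediately above is bounded, and it is
unbounded for $\epsilon = 1$.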
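As a numerical cross-check of the conditional expectation $E(\tilde{a} \mid \tilde{\alpha} > \epsilon)$ for $\epsilon > 1$, the sketch below evaluates it in two ways: by integrating $x_0 z/(z-1)$ against the gamma-1 density directly, and by the binomial expansion obtained after the substitutions $u = z - 1$, $y = \lambda u$, in which every integral has lower limit $\lambda(\epsilon - 1)$. It assumes an integer shape $r$, parameterizes by $\lambda$ instead of $t$, truncates the infinite integrals at a finite cutoff, and uses Simpson's rule; all of these, and the function names, are illustrative assumptions.

```python
import math

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with an even number n of panels."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * f(lo + j * h)
    return s * h / 3.0

def erlang_sf(c, i):
    """Standardized gamma right tail G(c | i) for a positive integer i."""
    return math.exp(-c) * sum(c**j / math.factorial(j) for j in range(i))

def cond_mean_expansion(eps, r, lam, x0, cutoff=60.0):
    """Binomial-expansion form; phi is the integral of exp(-y)/y."""
    c = lam * (eps - 1.0)
    phi = simpson(lambda y: math.exp(-y) / y, c, c + cutoff)
    p = erlang_sf(lam * eps, r)                  # P(alpha > eps)
    total = lam**r * phi                         # the i = 0 term
    for i in range(1, r + 1):
        total += math.comb(r, i) * lam ** (r - i) * math.gamma(i) * erlang_sf(c, i)
    return x0 * math.exp(-lam) * total / (p * math.gamma(r))

def cond_mean_direct(eps, r, lam, x0, cutoff=60.0):
    """Direct integral of x0*z/(z-1) against the gamma-1 density."""
    def pdf(z):
        return math.exp(-lam * z) * (lam * z) ** (r - 1) * lam / math.gamma(r)
    p = erlang_sf(lam * eps, r)
    return x0 * simpson(lambda z: z / (z - 1.0) * pdf(z), eps, eps + cutoff) / p

if __name__ == "__main__":
    print(cond_mean_expansion(2.0, 3, 1.5, 2.0), cond_mean_direct(2.0, 3, 1.5, 2.0))
```

With $\epsilon = 2$ every admissible $\alpha$ exceeds 2, so the mean $\alpha x_0/(\alpha - 1)$ lies between $x_0$ and $2 x_0$, which gives a quick plausibility bound on the result.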

4.   Sampling  Distribution  and  Preposterior  Analysis  With  Fixed  Sample  Size  r 

We assume that a sample of size $r$ is taken from an independent Pareto
process, and that the statistic $\tilde{t}$ is left to chance. It is also
assumed that $x_0$ is known and fixed and that the parameter $\tilde{\alpha}$
has a gamma-1 distribution of the type (3.6). We need the following theorem
in the sequel:


4.1  Convolution of r Pareto Density Functions

We define the (multiplicative) convolution $g^*$ of any $r$ density functions
$f_1, f_2, \ldots, f_r$ by

$$
g^* = f_1 * f_2 * \cdots * f_r , \tag{4.1}
$$

$$
g^*(t) \equiv \int_R f_1(z_1) f_2(z_2) \cdots f_r(z_r)\,dA \tag{4.2}
$$

where $R$ is the $(r-1)$-dimensional hypersurface

$$
\prod_{i=1}^{r} z_i = t . \tag{4.3}
$$

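Theorem: The convolution $g^*$ of $r \ge 1$ Pareto densities, each with
parameter $(\alpha, 1)$, is

$$
g^*(t \mid \alpha, 1) = \frac{1}{\Gamma(r)} \, \alpha^{r} \, t^{-(\alpha+1)} (\log_e t)^{r-1} , \qquad
r \ge 1 , \quad t \ge 1 , \quad \alpha > 0 . \tag{4.4}
$$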


Proof: We prove this by showing that if

$$
\tilde{x}_i \mid \alpha \sim f_a(x \mid \alpha, x_0) , \qquad i = 1, 2, \ldots, r ,
$$

and we scale all $x_i$ into units of $x_0$, then $g^*$ may be represented as an
(additive) convolution $g$ of gamma-1 densities. We then show that the fact
that $g$ is a gamma-1 density (ASDT, p. 224) implies that $g^*$ is as stated
above.

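Define $z_i = x_i / x_0$ and $e^{y_i} = z_i$, so that

$$
\tilde{z}_i \sim f_a(z_i \mid \alpha, 1)\,dz_i = \alpha z_i^{-(\alpha+1)}\,dz_i , \qquad \alpha > 0 , \quad z_i \ge 1 , \tag{4.5}
$$

or

$$
\tilde{y}_i \sim f_e(y_i \mid \alpha)\,dy_i = \alpha e^{-\alpha y_i}\,dy_i , \qquad \alpha > 0 , \quad y_i \ge 0 , \quad i = 1, 2, \ldots, r , \tag{4.6}
$$

or, letting $u_i = \alpha y_i$,

$$
\tilde{u}_i \sim f_e(u_i \mid 1)\,du_i = e^{-u_i}\,du_i , \qquad u_i \ge 0 , \quad i = 1, 2, \ldots, r . \tag{4.7}
$$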

By the Theorem on page 224 of ASDT, the convolution $g$ of the $r$ density
functions (4.7) is a standardized gamma-1 density with parameter $r$:

$$
\tilde{v} \equiv \sum_{i=1}^{r} \tilde{u}_i \sim f_{\gamma *}(v \mid r) = \frac{1}{\Gamma(r)} e^{-v} v^{r-1} . \tag{4.8}
$$

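If we define

$$
v = \sum_{i=1}^{r} u_i = \sum_{i=1}^{r} \log_e z_i^{\alpha} = \log_e \prod_{i=1}^{r} z_i^{\alpha} = \log_e t^{\alpha} ,
$$

then, since $\alpha > 0$, $\log_e t^{\alpha}$ is a monotonic increasing function
of $t$. Hence (4.8) implies that

$$
\begin{aligned}
g^*(t) &= f_{\gamma *}(\log_e t^{\alpha} \mid r) \left| \frac{d}{dt} \log_e t^{\alpha} \right| \\
&= \frac{1}{\Gamma(r)} e^{-[\log_e t^{\alpha}]} [\log_e t^{\alpha}]^{r-1} (\alpha t^{-1}) \\
&= \frac{1}{\Gamma(r)} \alpha^{r} t^{-(\alpha+1)} [\log_e t]^{r-1} , \qquad r \ge 1 , \quad t \ge 1 .
\end{aligned}
$$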

It follows from the above-stated theorem that the convolution $g^*$ of $r$
Pareto densities each with parameter $(\alpha, x_0)$ is

$$
g^*(t) = \frac{1}{\Gamma(r)} \alpha^{r} (x_0^{r})^{\alpha} \, t^{-(\alpha+1)} [\log_e t - \kappa]^{r-1} , \qquad
x_0 > 0 , \quad t \ge x_0^{r} , \quad r > 0 , \tag{4.9}
$$

where $t \equiv \prod_{i=1}^{r} x_i$ and $\kappa \equiv r \log_e x_0$.
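For $r = 2$ the multiplicative convolution reduces to a one-dimensional integral, $g^*(t) = \int f(z) f(t/z) z^{-1}\,dz$ over $x_0 \le z \le t/x_0$, which gives a direct numerical check of (4.9). The Python sketch below is a minimal illustration; the function names and the midpoint rule are assumptions made here.

```python
import math

def pareto_pdf(x, alpha, x0=1.0):
    """Pareto density (1)."""
    return 0.0 if x < x0 else alpha * x0**alpha * x ** (-(alpha + 1.0))

def product_density(t, alpha, r, x0=1.0):
    """Density (4.9) of the product of r Pareto variables; (4.4) when x0 = 1."""
    if t <= x0**r:
        return 0.0
    kappa = r * math.log(x0)
    return (alpha**r * x0 ** (r * alpha) * t ** (-(alpha + 1.0))
            * (math.log(t) - kappa) ** (r - 1) / math.gamma(r))

def product_density_r2_numeric(t, alpha, x0=1.0, n=20000):
    """Midpoint-rule evaluation of the r = 2 multiplicative convolution."""
    lo, hi = x0, t / x0
    h = (hi - lo) / n
    total = 0.0
    for j in range(n):
        z = lo + (j + 0.5) * h
        total += pareto_pdf(z, alpha, x0) * pareto_pdf(t / z, alpha, x0) / z
    return total * h

if __name__ == "__main__":
    print(product_density(9.0, 1.7, 2, x0=1.5),
          product_density_r2_numeric(9.0, 1.7, x0=1.5))
```

In this case the integrand is proportional to $1/z$, so the integral is $\log_e t - 2 \log_e x_0$, which is exactly the bracketed factor in (4.9) with $r = 2$.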


4.2   Conditional  Distribution  of  t|a 

Using the results of Section 4.1, the conditional probability density, given
$\alpha$ and $x_0$, that the product of $r$ identically distributed Pareto
random variables will have a value $t$ may be written as

$$
D(t \mid \alpha, x_0; r) = f(t \mid \alpha, x_0; r) = g^*(t) . \tag{4.10}
$$

4.3  Unconditional Distribution of t

If a sample of size $r$ is drawn from a Pareto process with unknown
parameter $\tilde{\alpha}$ regarded as a gamma-1 random variable distributed
with parameter $(r', t')$ as shown in (3.6), and all sample observations are
scaled into units of $x_0$, then

$$
D(t \mid x_0 = 1, r', t'; r) = \frac{1}{B(r, r')} \, \frac{\lambda'^{\,r'} (\log_e t)^{r-1}}{t \, [(\log_e t) + \lambda']^{r + r'}} , \qquad
t \ge 1 , \quad r, r', \lambda' > 0 , \tag{4.11a}
$$

where

$$
\lambda' = \log_e t' , \qquad B(r, r') = \frac{\Gamma(r) \Gamma(r')}{\Gamma(r + r')} . \tag{4.11b}
$$

Alternatively, the unconditional distribution of the sufficient statistic
$\tilde{y} = \log_e \tilde{t}$ is inverted beta-2:

$$
D(y \mid x_0 = 1, r', t'; r) = f_{i\beta 2}(y \mid r, r', \lambda')
= \frac{1}{B(r, r')} \, \frac{\lambda'^{\,r'} y^{r-1}}{(y + \lambda')^{r''}} , \qquad
y \ge 0 , \quad r, r', \lambda' > 0 , \tag{4.12}
$$

with $r'' = r + r'$.


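Proof: To prove (4.11a) note that from (4.4) and (3.6)

$$
\begin{aligned}
D(t \mid x_0 = 1, r', t'; r) &= \int_0^{\infty} g^*(t \mid \alpha, 1) \, f_{\gamma 1}(\alpha \mid r', t')\,d\alpha \\
&= \frac{(\log_e t)^{r-1} \lambda'^{\,r'}}{t \, \Gamma(r) \Gamma(r')} \int_0^{\infty} \alpha^{r''-1} e^{-(\lambda' + \log_e t)\alpha}\,d\alpha \\
&= \frac{1}{B(r, r')} \, \frac{\lambda'^{\,r'} (\log_e t)^{r-1}}{t \, [(\log_e t) + \lambda']^{r''}} , \qquad t \ge 1 ,
\end{aligned}
$$

where $r'' = r + r'$.

Formula (4.12) follows by making the substitution $y = \log_e t$ in (4.11a).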
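The mixing step in this proof can be reproduced numerically. The sketch below, with $x_0 = 1$, integrates the conditional density of $t$ given $\alpha$ against a gamma-1 prior and compares the result with the closed forms (4.11a) and (4.12). The function names, the finite cutoff, and Simpson's rule are illustrative assumptions, and the prior is parameterized by $\lambda'$ directly.

```python
import math

def inverted_beta2_pdf(y, r, r_prior, lam_prior):
    """Density (4.12) of y = log t, evaluated on the log scale."""
    if y <= 0.0:
        return 0.0
    log_b = math.lgamma(r) + math.lgamma(r_prior) - math.lgamma(r + r_prior)
    return math.exp(r_prior * math.log(lam_prior) + (r - 1.0) * math.log(y)
                    - (r + r_prior) * math.log(y + lam_prior) - log_b)

def predictive_density_t(t, r, r_prior, lam_prior):
    """Density (4.11a) of t; the factor 1/t is the Jacobian of y = log t."""
    if t <= 1.0:
        return 0.0
    return inverted_beta2_pdf(math.log(t), r, r_prior, lam_prior) / t

def mixture_numeric(t, r, r_prior, lam_prior, cutoff=40.0, n=20000):
    """Simpson-rule integral over alpha of (4.4) times the gamma-1 prior."""
    def integrand(a):
        g = a**r * t ** (-(a + 1.0)) * math.log(t) ** (r - 1) / math.gamma(r)
        prior = (math.exp(-lam_prior * a) * (lam_prior * a) ** (r_prior - 1)
                 * lam_prior / math.gamma(r_prior))
        return g * prior
    h = cutoff / n
    s = integrand(0.0) + integrand(cutoff)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * integrand(j * h)
    return s * h / 3.0

if __name__ == "__main__":
    print(predictive_density_t(4.0, 3, 2, 1.2), mixture_numeric(4.0, 3, 2, 1.2))
```

Working with `math.lgamma` avoids overflow in the beta function for large $r$ and $r'$; the numerical mixture is only a check and would not be needed in practice once (4.11a) is available.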